A problem that I tackled in my research is the lackluster support for evidence-based interventions in education.

I often observed that programs that match policy-makers’ intuitions about how school should look receive more support for implementation than programs with strong evidence that don’t resemble policy-makers’ expectations and intuitions.

So the first step in this research program was to document empirically whether such a bias against atypical educational interventions exists.

I did this with a two-by-two experiment crossing evidence quality with stereotypicality, and I explored three possible explanations for, and solutions to, the bias against atypical programs in education:

  1. Lack of training
  2. Threat to career/identity
  3. Stereotypical thinking (i.e., heuristics)
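The two-by-two design crosses evidence quality with stereotypicality. A minimal sketch of the four resulting cells, with hypothetical factor labels (the actual wording of the study materials is not shown here):

```python
from itertools import product

# Hypothetical level names; the study's vignette wording is assumed.
evidence_quality = ["strong", "weak"]
stereotypicality = ["typical", "atypical"]

# The four cells of the 2x2 between-subjects design.
conditions = [
    {"evidence": e, "stereotype": s}
    for e, s in product(evidence_quality, stereotypicality)
]

for c in conditions:
    print(c["evidence"], c["stereotype"])
```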

Support for Different Types of Interventions

Policy makers base their support for educational programs on evidence quality and on educational stereotypes to almost the same degree.

| Variable                                            | Estimate | SE   | t     | 95% CI         |
|-----------------------------------------------------|----------|------|-------|----------------|
| Intercept                                           | 3.78     | 0.06 | 68.42 | [3.66, 3.89]   |
| Evidence Quality (Contrast)                         | 0.18     | 0.04 | 4.77  | [0.11, 0.25]   |
| Stereotypicality (Contrast)                         | 0.15     | 0.04 | 4.04  | [0.10, 0.24]   |
| Evidence Quality (Contrast) × Stereotypicality (Contrast) | -0.01 | 0.04 | -0.37 | [-0.09, 0.05] |

Results are from a hierarchical linear model with nesting by survey year. Overall mean (SD) of support = 3.84 (0.98).
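With contrast-coded predictors, the fixed effects in the table above imply four predicted cell means. A sketch assuming ±0.5 contrast coding (the coding actually used in the analysis is not reported, so the sign convention and resulting cell values are illustrative):

```python
# Fixed effects from the support model (table above).
intercept, b_eq, b_st, b_int = 3.78, 0.18, 0.15, -0.01

def predicted_support(eq, st):
    """Predicted support for contrast codes eq, st in {-0.5, +0.5}.

    Assumes +0.5 = strong evidence / stereotypical program.
    """
    return intercept + b_eq * eq + b_st * st + b_int * eq * st

for eq in (-0.5, 0.5):
    for st in (-0.5, 0.5):
        print(f"eq={eq:+.1f} st={st:+.1f} -> {predicted_support(eq, st):.3f}")
```

Under this coding, the strong-evidence/stereotypical cell sits highest and the weak-evidence/atypical cell lowest, with the two factors contributing almost equally to the spread.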

Grant Funding for School Interventions

These effects even hold up when we look at grant funding:

Grant funding as a function of evidence quality and stereotypicality conditions

| Variable                                            | Estimate | SE   | t     | 95% CI         |
|-----------------------------------------------------|----------|------|-------|----------------|
| Intercept                                           | 3.18     | 0.05 | 63.61 | [3.07, 3.28]   |
| Evidence Quality (Contrast)                         | 0.13     | 0.03 | 3.99  | [0.08, 0.20]   |
| Stereotypicality (Contrast)                         | 0.11     | 0.03 | 3.46  | [0.06, 0.18]   |
| Evidence Quality (Contrast) × Stereotypicality (Contrast) | 0.01 | 0.03 | 0.18  | [-0.06, 0.06] |

Results are from a hierarchical linear model with nesting by survey year. Overall mean (SD) of grant support = 3.21 (0.81).

What are possible explanations for this gap in support that might point toward solutions?

Lack of training in research methods?

Breakdown of sample by training levels:

| Highest degree | n   |
|----------------|-----|
| PhD            | 445 |
| MA             | 249 |
| BA             | 42  |


PhD subsample (N = 382)

| Variable                                            | Estimate | SE   | t     | 95% CI         |
|-----------------------------------------------------|----------|------|-------|----------------|
| Intercept                                           | 3.73     | 0.05 | 70.70 | [3.62, 3.84]   |
| Evidence Quality (Contrast)                         | 0.23     | 0.05 | 4.38  | [0.13, 0.33]   |
| Stereotypicality (Contrast)                         | 0.16     | 0.05 | 2.98  | [0.05, 0.26]   |
| Evidence Quality (Contrast) × Stereotypicality (Contrast) | -0.06 | 0.05 | -1.21 | [-0.17, 0.04] |

Master’s (MA) subsample (N = 242)

| Variable                                            | Estimate | SE   | t     | 95% CI         |
|-----------------------------------------------------|----------|------|-------|----------------|
| Intercept                                           | 3.77     | 0.06 | 62.38 | [3.65, 3.89]   |
| Evidence Quality (Contrast)                         | 0.17     | 0.06 | 2.89  | [0.06, 0.29]   |
| Stereotypicality (Contrast)                         | 0.21     | 0.06 | 3.43  | [0.09, 0.32]   |
| Evidence Quality (Contrast) × Stereotypicality (Contrast) | 0.00 | 0.06 | 0.03  | [-0.12, 0.12] |

Qualitative vs. Quantitative Training?

| Methodology | n   |
|-------------|-----|
| Qual        | 178 |
| Quant       | 151 |
| NA          | 116 |

The only insight from exploring the qualitative vs. quantitative distinction among PhDs is that, surprisingly, quantitatively trained respondents heavily penalize the atypical program with weak evidence but are otherwise unmoved by evidence quality.

So regardless of whether they hold a master's degree or a PhD, in quantitative or qualitative methods, respondents show levels of bias similar to the full sample, which suggests that a lack of education or training is not the problem.

Might new atypical approaches represent a threat to the careers/identities of established researchers?

If so then we would expect to see biased evaluation of evidence supporting threatening findings.

| Variable                                            | Estimate | SE   | t     | 95% CI         |
|-----------------------------------------------------|----------|------|-------|----------------|
| Intercept                                           | 3.98     | 0.05 | 87.98 | [3.89, 4.07]   |
| Evidence Quality (Contrast)                         | 0.62     | 0.05 | 13.80 | [0.54, 0.71]   |
| Stereotypicality (Contrast)                         | 0.05     | 0.05 | 1.05  | [-0.04, 0.14]  |
| Evidence Quality (Contrast) × Stereotypicality (Contrast) | 0.03 | 0.05 | 0.57  | [-0.06, 0.11] |
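The contrast between the large evidence-quality coefficient (0.62) and the small stereotypicality coefficient (0.05) can be eyeballed with a normal-approximation 95% interval (estimate ± 1.96 × SE); this is only a rough sketch, since the reported intervals are presumably t-based and differ slightly:

```python
def approx_ci(estimate, se, z=1.96):
    """Normal-approximation 95% confidence interval."""
    return (estimate - z * se, estimate + z * se)

# Coefficients from the evidence-evaluation model above.
lo, hi = approx_ci(0.62, 0.05)
print(f"Evidence quality:  [{lo:.2f}, {hi:.2f}]")  # well clear of zero

lo, hi = approx_ci(0.05, 0.05)
print(f"Stereotypicality:  [{lo:.2f}, {hi:.2f}]")  # straddles zero
```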

Perhaps reviewers weight the same evidence differently when determining their actual support?

Experimental methods targeting bias against atypical school interventions:

Both the precommitment and teacher-retraining conditions represent significant improvements over baseline, t(145.69) = 2.24, p = .026 and t(88.31) = 2.09, p = .040, respectively.
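The fractional degrees of freedom reported above indicate Welch's unequal-variance t-test. A self-contained sketch of the statistic and the Welch–Satterthwaite degrees of freedom, using made-up data rather than the study's:

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)   # sample variances (n - 1 denominator)
    se2 = va / na + vb / nb             # squared standard error of the difference
    t = (mean(a) - mean(b)) / se2 ** 0.5
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Illustrative data only.
t, df = welch_t([1, 2, 3, 4], [2, 4, 6, 8])
print(f"t({df:.2f}) = {t:.2f}")
```

Unlike Student's t-test, the df here depend on the two groups' variances, which is why values like 145.69 and 88.31 come out non-integer.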